Fast Text Compression with Neural Networks

Author

  • Matthew V. Mahoney
Abstract

Neural networks have the potential to extend data compression algorithms beyond the character-level n-gram models now in use, but have usually been avoided because they are too slow to be practical. We introduce a model that produces better compression than popular Lempel-Ziv compressors (zip, gzip, compress), and is competitive in time, space, and compression ratio with PPM and Burrows-Wheeler algorithms, currently the best known. The compressor, a bit-level predictive arithmetic encoder using a 2-layer, 4 × 10^6 by 1 network, is fast (about 10^4 characters/second) because only 4-5 connections are simultaneously active and because it uses a variable learning rate optimized for one-pass training.
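The idea in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the hash function, the table size, the context orders, and the 1/sqrt(count) learning-rate schedule are all assumptions chosen for illustration, and the function name estimate_compressed_bits is hypothetical. Instead of running a real arithmetic coder, the sketch sums the ideal code length, -log2 of the probability assigned to each actual bit, which is the size an arithmetic coder would approach. Each bit is predicted by a network with no hidden layer in which only one input per context order is active, so a prediction touches only four weights.

```python
import math


def hash_context(order, history, bit_prefix, table_size):
    """Hypothetical hash of the last `order` bytes plus the bits seen so far."""
    h = (order + 1) * 0x9E3779B1 & 0xFFFFFFFF
    for b in history[-order:]:
        h = (h * 0x01000193 + b + 1) & 0xFFFFFFFF
    h = (h * 0x01000193 + bit_prefix) & 0xFFFFFFFF
    return h % table_size


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-30.0, min(30.0, x))
    return 1.0 / (1.0 + math.exp(-x))


def estimate_compressed_bits(data, orders=(1, 2, 3, 4),
                             table_size=1 << 16, base_rate=0.5):
    """Return the ideal code length, in bits, of `data` under an online model."""
    weights = [0.0] * (len(orders) * table_size)
    counts = [0] * len(weights)  # how often each weight has been trained
    history = bytearray()
    total_bits = 0.0
    for byte in data:
        prefix = 1  # leading 1 marks how many bits of this byte are known
        for shift in range(7, -1, -1):
            bit = (byte >> shift) & 1
            # One active input per context order: the sparse connections.
            active = [k * table_size
                      + hash_context(o, history, prefix, table_size)
                      for k, o in enumerate(orders)]
            p1 = sigmoid(sum(weights[j] for j in active))
            p1 = min(max(p1, 1e-6), 1.0 - 1e-6)
            total_bits += -math.log2(p1 if bit else 1.0 - p1)
            # Single-pass update; the rate shrinks as a weight is reused
            # (an assumed schedule, not the paper's).
            err = bit - p1
            for j in active:
                counts[j] += 1
                weights[j] += err * base_rate / math.sqrt(counts[j])
            prefix = (prefix << 1) | bit
        history.append(byte)
    return total_bits
```

Calling estimate_compressed_bits(b"abcabcabc" * 200) and dividing by the input length gives an estimate in bits per character. On highly repetitive input such as this, the estimate falls well below the 8 bits per character of the uncompressed data.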


Related resources

Training an MLP Neural Network for Image Compression Using the GSA Method

Image compression is an important research field in image processing, and many compression methods have been proposed to date. Neural networks are one such method and have performed well in many applications. Neural networks are usually trained with error backpropagation, whose drawbacks are slow convergence and stopping in points of lo...


Text Compression Via Alphabet Re-Representation

This article introduces the concept of alphabet re-representation in the context of text compression. We consider re-representing the alphabet so that a representation of a character reflects its properties as a predictor of future text. This enables us to use an estimator from a restricted class to map contexts to predictions of upcoming characters. We describe an algorithm that uses this idea...


Edge Preserving Image Compression for Magnetic Resonance Images Using Dann-based Neural Networks

of a master’s thesis at the University of Miami. Thesis supervised by Professor Mansur R. Kabuka. No. of pages in text: 51. With the tremendous growth in imaging applications and the development of filmless radiology, compression techniques that can achieve high compression ratios at user-specified distortion rates have become necessary. Boundaries and edges in the tissue structures ...


Syntactically Informed Text Compression with Recurrent Neural Networks

We present a self-contained system for constructing natural language models for use in text compression. Our system improves upon previous neural network based models by utilizing recent advances in syntactic parsing – Google’s SyntaxNet – to augment character-level recurrent neural networks. RNNs have proven exceptional in modeling sequence data such as text, as their architecture allows for m...


Locating Anchor Shots in Compression Domain Based on Neural Networks

Anchor shots are important elements in news video, and locating them accurately and thoroughly is crucial to parse news video. The paper presents a novel approach, using neural networks, to detect anchor clips. Firstly, a background model is constructed through neural networks learning. Then, the trained neural networks classify frames in news video into two classes, i.e. anchor frames and non-...



Publication date: 2000